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We introduce a rigorous method to microscopically compute the observables which characterize the thermo- 
dynamics and kinetics of rare macromolecular transitions for which it is possible to identify a priori a slow 
reaction coordinate. In order to sample the ensemble of statistically significant reaction pathways, we define a 
biased molecular dynamics (MD) in which barrier-crossing transitions are accelerated without introducing any 
unphysical external force. In contrast to other biased MD methods, in the present approach the systematic er- 
rors which are generated in order to accelerate the transition can be analytically calculated and therefore can be 
corrected for. This allows for a computationally efficient reconstruction of the free-energy profile as a function 
of the reaction coordinate and for the calculation of the corresponding diffusion coefficient. The transition path 
time can then be readily evaluated within the Dominant Reaction Pathways (DRP) approach. We illustrate and 
test this method by characterizing a thermally activated transition on a two-dimensional energy surface and the 
folding of a small protein fragment within a coarse-grained model. 



I. INTRODUCTION 

The recent developments in single-molecule optical- and 
force- spectroscopy allow to experimentally characterize the 
thermodynamics and kinetics of many fundamental biomolec- 
ular reactions to an unprecedented level of accuracy. For ex- 
ample, pulling experiments based on optical tweezers JT] or 
atomic-force microscopy [2] can provide the full free-energy 
profile of biopolymers as a function of their end-to-end dis- 
tance Q, while the single-molecule Foster Resonance Energy 
Transfer spectroscopy yields the reaction rate [4| and, very re- 
cently, the transition path time (TPT) . 

The possibility of measuring these observables poses the 
challenge to predict their dynamics from microscopic atom- 
istic simulations. Unfortunately, the MD algorithms are very 
inefficient to this purpose, because they require to simulate 
time intervals which are exponentially long in the free-energy 
barrier. 

These limitations have motivated the development of alter- 
native theoretical frameworks to investigate the free energy 
landscape [6-8 1 and reaction kinetics properties ll9l- [T8l of ac- 
tivated reactions. Some of these methods — such as e.g. the 
meta-dynamics approach [6] — involve a suitable choice of a 
set of reaction coordinates which are used to bias and accel- 
erate the exploration of the energy landscape. By contrast, 
methods like transition interface sampling ifTBI . milestoning 
ifTTl or dynamics Monte Carlo [18] sample directly the space 
of reactive pathways, without introducing a bias on the dy- 
namics. On the other hand, these methods are in general com- 
putationally quite costly. 

In a recent work, a variant of the Dominant Reaction 
Pathway (DRP) method ifTWTl has been developed l22l . 
which generates statistically significant protein folding path- 
ways by combining an accelerated MD algorithm 1 23 1 with a 
path-integral based variational approach. Using the acceler- 
ated MD, several hundreds of folding trajectories for single- 
domain proteins of typical size can be generated in just a few 
hundreds of CPU hours. The trial paths are then ranked in 
terms of their statistical weight in the (unbiased) over-damped 



Langevin dynamics, and the most probable (i.e. least biased) 
trajectories among them are identified. This way, many of 
such so-called dominant reaction pathways, each correspond- 
ing to a different initial condition, have been computed for a 
WW protein domain using a realistic force field [22 1 . These 
paths were found to agree very well with those obtained by 
Shaw and co-workers within the same force field, by means 
of ultra-long MD simulations on the Anton special-purpose 
machine l24ll . 

The high efficiency of the DRP method comes from the fact 
that its computational time does not scale exponentially with 
the free-energy barriers. By this method it is now possible to 
atomistically study the dynamics of polypeptides with realis- 
tic size and kinetics, and even simulate the folding of complex 
knotted proteins fl25ll . 

The main limitation of this approach is that, since the ac- 
celerated dynamics used to generate the ensemble of reaction 
pathways breaks microscopic reversibility, it cannot be used 
to directly compute kinetic and thermodynamic observables. 
Due to this problem, to date the DRP method has been used 
only to investigate the reaction mechanism, for example by 
characterizing the transition state ensemble. 

In this work, we overcome this limitation. We devise a rig- 
orous scheme to compute the transition path time and evaluate 
the potential of mean-force of a previously determined slowly- 
evolving collective coordinate (CC). The method is based on a 
new type of accelerated MD algorithm, herby called hindered 
molecular dynamics (hMD). In contrast to other methods, in 
the hMD algorithm, no external force is introduced to speed 
up the reaction. In addition, the effect of the bias on the evo- 
lution of a slow reaction coordinate can be rigorously and an- 
alytically computed, hence can be corrected for. As a result, it 
is possible to extract the potential of mean-force and the dif- 
fusion coefficient of the reaction coordinate from a set of suit- 
able averages evaluated over the reactive trajectories obtained 
from the biased dynamics. Once these quantities have been 
determined, the transition path time can be easily computed 
employing the DRP formalism. 

The paper is organized as follows. In the next section we 
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introduce the hMD dynamics and show how to extract the po- 
tential of mean-force and the diffusion constant of a slow re- 
action coordinate. In section [HI] we discuss how to compute 
the transition path time in the DRP formalism. Section IV 
provides two illustrative applications to systems of increasing 
complexity. Finally, results and conclusions are summarized 
in section [V] 



II. COMPUTING THE POTENTIAL OF MEAN-FORCE 
AND THE DIFFUSION COEFFICIENT OF A CC FROM 
HMD SIMULATIONS 

Let us begin by considering the over-damped Langevin 
equation which mimics the microscopic dynamics of the 
molecule in a solvent. In the so-called Ito calculus this equa- 
tion is defined as: 



x i+ i = xi - (At/y)VU(xi) + V2DAt r|i, 



(1) 



where yis the viscosity, D = ^y- is the diffusion coefficient, x, 
is the point in the 3A^-dimensional configuration space visited 
at the i—th time step, U (x) is the potential energy and r|(f ) is a 
stochastic variable sampled from a Gaussian distribution with 
with zero average and unitary variance. The high-friction limit 
which underlies the over-damped Langevin Eq. ([T| is appro- 
priate for many systems of biophysical interest. For example, 
in proteins, the effects of the acceleration affects the dynamics 
only at time scales smaller than few fractions of ps [26 1. 

We now introduce the hMD, which is closely related to the 
ratchet-and-pawl MD[23| used to generate trial paths in our 
previous protein folding DRP calculations |22|. The main dif- 
ference between the two algorithms is that, in the hMD, no un- 
physical external force is introduced to disfavor fluctuations in 
the direction of the reactant. Instead, when the system tries to 
evolve backwards along a reaction coordinate, the dynamics 
is slowed down by increasing the viscosity and the decreasing 
the heat-bath temperature. 

Namely, denoting with z(x) a configuration-dependent CC 
— assumed for definiteness to monotonically increase from 
the reactant to the product — , the hMD is defined by the fol- 
lowing stochastic differential equation: 

At. 



Xi+i = 6[z(x i+1 )-z(xi)] \Xi - —VU(xi) + V2DMT], 

+ e[z(xi)-z(x i+1 )] (xi - ^VU(xi) + i V2DAtt\i 

(2) 

where t, > 1 is called the hindering coefficient. 

By scoring the trajectories generated by the hMD |2]i ac- 
cording to the path probability of the unbiased Langevin dy- 
namics ([1} one can efficiently obtain an ensemble of statis- 
tically representative reaction pathways, see Ref. 11221 . Un- 
fortunately, the time intervals of the hMD (|2| are not physi- 
cally meaningful and dominant pathways alone do not allow 
to compute free-energy differences. 

In order to overcome this problem and establish the con- 
nection with kinetics and thermodynamics our strategy is to 



analyze the average time evolution of some slow reaction co- 
ordinate Q. In general, such a collective variable does not nec- 
essarily need to coincide with the biasing variable z, but can 
be related to it by a constant scaling factor, which produces a 
rescaling of the diffusion coefficient. 

In the limit in which the spontaneous time evolution of the 
CC Q very slow compared to that of all microscopic degrees 
of freedom, its (unbiased) dynamics can be described by an 
effective over-damped Langevin equation: 



Q m = Qi - (Af/Y e ) G'{Qi) + yj2k B TAt/y Q (3) 

where Jq and Dq = kgT /jq the respectively the viscosity and 
the diffusion coefficient of the CC. In the following we re- 
strict to the case in which the diffusion constant is assumed 
to be state independent (white noise). The generalization to 
colored noise is possible but it requires a careful choice of 
the stochastic calculus, and will not be considered in this first 
work. 

In an hMD simulation, any time the system evolves towards 
a smaller value of the Q (hence of z), the dynamics is slowed 
down by the same rescaling of the diffusion coefficient and 
temperature of the underlying microscopic dynamics. Hence, 
the equation of motion of Q in a hMD simulation is given by: 



Qi+i-Qi = (-^G'm + y/lDQAtv^ 



(4) 



i I |-i |e 



At 



2y Q k B T 



G'(Qi)-m 



where the role of the step-function is to hinder the dynamics 
when the fluctuation would drive the reaction backwards. 

From the stochastic differential Eq. Q it is immediate to 
compute the probability for the system to evolve from a con- 
figuration with CC Qj to one with CC Qi+\, in a elementary 
step Af of hMD simulation: 



-? 2 (Ae+^G'(Ci)) 2 



e[-Afi] 



-(A6+ 



+e 



e[Ag] 



(5) 



where AQ = Q M - Q; and # = y 4 ,A,t B r - 

Using this equation we compute the average infinitesimal 
displacement of the CC Q in a time interval At of hMD, start- 
ing from configurations in which the CC takes a value Q: 



(Ae(Q)W = J^^-^ G '(G)^ + ... (6 ) 

Similarly, the average square displacement {AQ 2 {Q))i 1 md 
reads: 



(AQ 2 (Q))hMD 



Atk B T 1+^ 2 

Ye ¥ 



(7) 
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In both equations (|6| and (j7]i the dots denote corrections of or- 
der Af 3 / 2 . If the biasing coordinates is defined in such a way to 
decrease (rather than increase) from the reactant to the prod- 
uct, the first term in the right-hand-side of Eq. (|6|l changes 
sign. The averages in Eq.s ([6]) and (j7]i can be efficiently eval- 
uated, hence allowing to determine yg and G(Q). 



III. COMPUTING THE TRANSITION PATH TIME 

In the previous section we have discussed how it is possible 
to compute the potential of mean-force G(Q) and the diffusion 
coefficient Dq, by evaluating averages over microscopic tra- 
jectories obtained in the hMD. We now show that, once these 
quantities have been determined, it is possible to restore the 
correct time scales in the calculated reactive trajectories, by 
applying the DRP formalism to the stochastic projected dy- 
namics of the CC defined in Eq. (p). 

The starting point of the DRP approach is the path integral 
representation of the conditional probability of going from Qj 
to Qf in time f: 



¥(Qf,t\Qi)=e 



G{Q f )-G(Q t ) 
2k R T 



Of 



a 



DQ e 



effV- 



(8) 



where 



s 'fM=£ d *(J^+ v 'ff\Q]) W 



is called the effective action and 
Dq 



Veff(Q) = 



G'(Q)\ 2 -2k B TG"(Q)) 



(10) 



is called the effective potential. The DRP equations result 
from analyzing the path integral <|8j in saddle-point approx- 
imation. The saddle-point paths (called the dominant reaction 
pathways) are the functional minima of the effective action. 
Hence, they obey the equation of motion 



Q = 2D Q V' eff {Q) 



(11) 



and conserve the effective energy E e ff = — V e ff[Q]. 

As discussed in detail Ref.s ETl l27l the saddle-point paths 
which are relevant in the description of thermal activation are 
those which leave the reactant and reach the product with 
(nearly) vanishing velocity. This observation implies 



V/' 



V eff {Qi) ~-G"(&-), 



(12) 



where we have used the fact the initial configuration Q, is in 
the vicinity of a free energy minimum. On the other hand, 
outside the (meta-) stable thermodynamical states, the effec- 
tive potential is dominated by its force contribution, 



V ef f(Q)~l/(Vc B Ty Q )\G'(Q)\ 2 . 
Hence, in the transition region E e ff /V e ff(Q) ~ o(IcbT). 



(13) 



The definition of effective energy immediately yields the 
time at which any given intermediate value Q of the CC, lo- 
cated between the reactant Q B and the product Qp, is visited 
by during a reaction |fT9ll2TI : 



t{Q) 



dQ 



Q* J4D Q (E eff + V ef f[Q]) 



(14) 



This equation can also be used compute the time at which the 
microscopic configurations in the dominant trajectories are 
reached, by imposing t (x) ~ t(Q(x)). In particular, Eq. ( 14 1 



provides an estimate of the TPT, which is obtained simply by 
setting Q = Qp and E eff ~ - V e ff(Q R ). 



A. Transition path time for crossing a harmonic barrier 

It is useful to discuss the TPT calculation within the simple 
harmonic approximation of a single free-energy barrier, which 
allows for an analytic treatment. In this case, the effective 
potential V e ff(Q) is also a harmonic function and reads 



Veff(Q) 



oc- 



4k B Ty Q 



■{Q-Qts? + 



a 

2Yq' 



(15) 



where Qts identifies the transition state and a = G"(Qts)- 

In an harmonic barrier, the transitions which involve over- 
coming of an energy barrier AG are those initiated by a point 

Qi such that \Q T s-Qi\ = 
ately gives 



Hence, Eq. ( 14 1 immedi 



tfPT — — In 
a 



( V2AGa + 2^k B T[E eff +^{\- 



AGyj 
k B T>\ 



2k B T(a + 2E eff ) 



16) 



Finally, retaining only the leading-order in the expansion 
in powers of the thermal energy k B T and recalling that 
Eeff/Veff ~ o(k B T) we arrive to the final simple result, 



t TPT = ±]n\4AG/(k B T)}. 
a 



(17) 



which is close to the estimate obtained by Szabo, tjpT — 
In [3 AG/ (k B T)]. It should be emphasized however that Eq. 
(114]) generalizes this estimate beyond the harmonic approxi- 
mation and the leading-order in the low-temperature expan- 



IV. ILLUSTRATIVE APPLICATIONS AND VALIDATION 

For illustration and validation purposes, in the remaining of 
this work we apply and test our method to characterize two 
reaction of increasing complexity. We begin by considering a 
simple two-dimensional toy model which can be straightfor- 
wardly implemented and for which analytic solution for the 
potential of mean-force exist. 
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FIG. 1: (Color online) The energy surface of the two-dimensional 
toy model used to validate the method. 



Next, we use our method to study a protein folding reac- 
tion within a coarse-grained representation of the polypeptide 
chain. In such a model, the folding reaction can be simulated 
directly by integrating the equation of motion and the results 
can be used to assess the accuracy of our method. 



A. Diffusion on two-dimensional funneled potential 

We consider the over-damped Langevin diffusion of a 
point-particle in the two-dimensional funneled energy surface 
shown in Fig{T] which is given by the potential: 



U(x,y) 



-Aio? 



A 2 o? 



W(x 2 



y 2 )+o 2 ) 2 



y 2 ) 2 +B sin 



((x 2 +y 2 )+a 2 ) 

2 I 0' 



(j) = arctan (y/x) , 
(18) 



with A i =20,A 2 = 10, a, = 1, o 2 = 5, a> = 0.02 andB= 10. 
As shown in Fig. [TJthis model contains a stable state at the ori- 
gin and meta-stable state at some finite distance from the bot- 
tom of the funnel and ~ 0. For kgT — 1, the barrier-crossing 
transition from the meta-stable to the stable state is thermally 
activated. The only slow coarse coordinate in this system is 
the distance of the particle from the origin, R = \Jx 2 +y 2 and 
the dominant reaction pathways are straight lines connecting 
the different initial conditions in the meta-stable state to the 
origin. By contrast, a typical Langevin trajectory spends a 
large time in the metastable state, performs a barrier crossing 
transition and eventually lands in the stable state. 

The mean first-passage-time through the transition state, 
obtained from the Langevin simulations is (tFpr)MD — 532 ± 
40 (units in which y= 1). The mean TPT can be calculated 
by measuring the length of an ensemble of Langevin trajecto- 
ries which are started at the edge of the reactant -arbitrarily 
defined by the condition Rr = 5- and are terminated once the 
particle reaches the edge of the product- identified by the con- 
dition Rp = 0.5. Accumulating statistics only on the trajecto- 
ries which do not visit the reactant before reaching the product 
we find (Itpt)md = 2.65 ±0.02. 

We now use Eq.s (|6]l and (j7]i to reconstruct the free energy 
landscape as a function of the CC z = R- This is done by 
running hMD simulations which bias the dynamics towards 
smaller and smaller distances from the origin, according to the 
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FIG. 2: (Color online) Upper panel: (AR 2 (R)} in the toy model, eval- 
uated in hMD (circles) and MD (squares) simulations. Lower panel: 
(AR(R)) in hMD simulations. All quantities are in computer units. 



algorithm given in Eq. ([2}. With a hindering coefficient \ = 1, 
generating a barrier crossing event requires simulating a time 
interval of about 0.4, which is a factor 10 3 times smaller than 
the mean-first-passage time. hMD trajectories are functionally 
close to the dominant reaction pathway, i.e. to the straight 
radial line with = 0. 

According to our method, the first step towards reconstruct- 
ing the free-energy surface consists in evaluating the friction 
coefficient for the CC used in the hMD, by means of Eq. (j7]i. 
Fig. |2]shows (AR 2 (R)) evaluated in the hMD simulation with 
an elementary time interval At = 0.0005. As predicted by Eq. 
{7J, this curve is flat almost everywhere. A weak dependence 
on R is observed for R < 2, and is due to the fact that in the 
high-force region inside the funnel, gradient-dependent cor- 
rections to Eq. |7} which are higher order in At become rel- 
evant. From a fit of the flat region, knowing that \ = 2, one 
obtains the correct result Jr ~ 1. To assess the validity of this 
calculation, in Fig. [2] we also show the same average evalu- 
ated in standard (i.e. unbiased) MD simulations. According 
to Einstein's law this average should be equal 7 ^ L , which 
allows to confirm the result obtained from hMD. 

Finally, knowing and having determined Jr, it is pos- 
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FIG. 3: (Color online) Comparison between exact (solid line) and 
calculated (circles) free-energy profile as a function of the CC R. 
The dotted line represents the potential energy as a function of the 
coordinate R, evaluated along the radial line with polar coordinate 
ij) = 0. The dot-dashed line shows the DRP time at which each value 
of R is assumed. 



sible to use Eq. ^ to extract the mean-force G'(R) from 
the average displacement (AR(R)) shown in Fig. [2] hence 
to reconstruct G(R). The calculated free-energy is shown in 
Fig. [3] where it is compared with the exact analytic result, 
G(R) = U(R) — ksT\og -g-, - here, Rq is the arbitrary refer- 
ence point — . The agreement between the two curves is quan- 
titative. 



The TPT estimated using the DRP equation (14 1, set- 
ting the E e ff — —V e ff{R m ) -where R m is the minimum free- 
energy distance in the meta-stable state — gives tjpj ~ 3.4. 
This number is not far from the average (tjp^MD = 2.65 ± 
0.02, obtained by MD simulations. In contrast, Szabo's for- 
mula, which relies on the harmonic approximation and low- 
temperature expansion, gives tjf pT = 0.8, which is off by a 
factor 3. This discrepancy suggest that temperature effects 
and specific curvature of the energy surface at the transition 
state can give significant corrections to the TPT. 



B. The folding of a poly-petide chain 

To further assess the accuracy and computational efficiency 
of our method, we apply it to study a conformational reac- 
tion which resembles most of the difficulties which are en- 
countered in macro molecular systems: the folding of the 16 
amino-acid C-terminus of protein GB1, whose native state is 
shown in the inset of Fig(4] 

To allow for a direct comparison with the result of standard 
MD simulations, we adopt the coarse-grained representation 
introduced in Ref. fl28l . In such an approach, the explicit de- 
grees of freedom individual amino-acids and the energy func- 
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FIG. 4: (Color online) Potential of mean-force as a function of the 
fraction of native contacts for the C-terminal of protein GB1 (the 
native structure shown in the insert). 



tion is a sum of pair-wise interactions: 



U = ~£&(x*+i-x*|-a) 2 + £4e 



-(Gij+Bij) 



\xj -Xi\ 



\Xi-Xi 



12 



(19) 



where x, denotes the position of the i—th residue, a = 0.38 nm, 
k = 3000 kJ mor'nm- 2 , e = 5 kJ moF 1 , and o = 0.3 nm. Gy 
is the matrix of native contacts, i.e. Gy is set to 1 if the dis- 
tance between the residues i and /' in the crystal native confor- 
mation is less than 0.65 nm, and otherwise (Go-type model, 
||29l ). The coefficients Ay and By are introduced in order to 
account for the hydro-philic (-phobic) character of the amino- 
acids, as in the so-called HP model [ 30 1 . 
Namely, we set 



• Ay = 1 and By = 
are hydrophobic 



1, for pairs in which both amino-acids 



• Ay = | and By = —1, for pairs in which one of the 
amino-acids is polar 

• Ajj = 1 and By = if one of the residues is GLY, which 
is hydrophobic ally neutral. 

Such non-native interactions introduce ruggedness in the en- 
ergy landscape, making simulations more challenging than in 
the purely native-centric model. 

The time evolution for the fraction of native contacts Q 
(which is a commonly adopted reaction coordinate for protein 
folding) over a long Langevin trajectory is shown in Fig. [5] 
The curve shown in the upper panel is compatible with a two- 
state system, separated by a single low free-energy barrier. 
This fact is confirmed by the points shown in Fig. [4] which 
represent the potential of mean-force for this system as a func- 
tion of Q obtained from a frequency histogram of the same tra- 
jectory. The lower panel of Fig. [3] shows the evolution of the 
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FIG. 5: (Color online) The time evolution of the fraction of native 
contacts Q of the poly-peptide obtained from a long Langevin sim- 
ulation with reversible folding-unfolding events (upper panel). The 
lower panel shows in detail the evolution of this variable along a typ- 
ical folding event. The integration of the (Ito) Langevin equation 
(|T]( was performed at the nominal temperature of T = 200 K, with an 
integration time step At = 0.0 lps and a viscosity y= 2000 amu ps . 



fraction of native contacts along a typical folding event. The 
average transition path time for folding reactions was found 
to be xfP T = (0.50 ±0.05) ns. 

Let us now discuss the calculation of the potential of mean- 
force using the hMD algorithm. We performed 800 indepen- 
dent short hMD simulations with an hindering coefficient of 

= 3, starting from the fully stretched configuration. The tra- 
jectories were biased using a CC which counts the number of 
native contacts: 



1 



where z„ = £ i<; - G;j, 



— E(C(*i,*/)-Gy) 

Z " i<j 



C(x x) - i-flx.-XjDAo) 6 



(20) 



(21) 



and ro = 0.7 nm. This biasing biasing CC was shown to be 
very efficient in guiding ratchet-and-pawl protein folding sim- 
ulations G2ll23l . 

In order to ease the physical interpretation of the results, 
we choose to reconstruct the free-energy surface as a func- 




FIG. 6: (Color online) Correlation between the ratchet variable z 
and the fraction of native contacts Q along the hMD trajectories used 
to reconstruct the free-energy surface. 



tion of the fraction of native contacts Q, rather than of the 
biasing variable z. We recall that, in order for the same hMD 
procedure to be transferable from the biasing coordinate to a 
reaction coordinate, the two variables should be directly pro- 
portional. Unfortunately, in general, this is not the case for Q 
and z, as it is evident from the fact that the biasing variable can 
be increased not only by forming a native contact, but also by 
breaking a non-native contact. However figure [6] shows that, 
while the linear correlation between z is Q can be violated in 
general, it is actually respected to good accuracy along the 
hMD folding trajectories, hence allowing the application of 
Eq.s Q and 0. 

From the ensemble of hMD trajectories we have evaluated 
the average and average-square displacements of the fraction 
of native contacts, (A<2(<2)) and (AQ 2 (Q)). The most straight- 
forward way to reconstruct G(Q) and compute Jq from these 
averages consists in using Eq. (|7ji to fit yg (hence compute 
the diffusion coefficient Dq = kgT /Jq ), and then insert this 
value into Eq. ^ to extract G'(Q). However, in the case of 
the present system, we have observed that higher-order cor- 
rections to Eq. ^ introduce some modulation in (AQ 2 (Q)), 
which spoil the accuracy of an estimate of Jq based on a con- 
stant fit. We have therefore developed a different protocol to 
evaluate G(Q) from the hMD averages, inspired by observa- 
tion that the hindering of the dynamics should be suppressed 
once the system crosses the TS. Indeed, beyond this point, the 
molecule is rapidly and spontaneously relaxing to the product 
state, hence we expect that the mean displacement (AQ(Q)) 
evaluated in hMD trajectories should undergo a rapid increase 
at the TS. 

Fig. [Tjshows the calculated (AQ(Q)) — evaluated over the 
time interval At = 10 ps — , which displays a steep increase in 
the region 0.47 < Q < 0.50, which represents our estimate for 
the location of the TS. Using G'(Q TS ) = 0, Eq. ^ leads to the 
estimate Jq = (3200 ± 1000) amu ps -1 . Once this parameter 
has been fixed, Eq. ^ yields G'(Q) in all other points. 

The results for the potential of mean-force are reported in 
Fig. |4] and show that the hMD algorithm is able to identify 



7 




from the DRP equation ( 14 1 using the calculated G(Q) and yg 
is shown in Fig. dSJ. . The dotted line shows the residence 



FIG. 7: (Color online) Average displacement of the fraction of native 
contacts (AQ) evaluated over a time interval At = 10 ps in the hMD 
simulation, as a function of the fraction of native contact Q. 




FIG. 8: (Color online) The time at which each value of the CC is 
visited, computed from the DRP equation l |14| l using the calculated 
G(Q) and Jq. The dotted line shows the residence time dt(Q), i.e. 
the integrand of Eq. l |14} . The initial condition is represented by the 
point on the left of the plot. 



the two-state character of the reaction kinetics and gives a 
free-energy barrier to fold which is in very good agreement 
with the results of MD simulations. On the other hand, the 
prediction for the free-energy profile is much less accurate in 
the region from the TS to the native state. This fact is ex- 
pected, since in the hMD approach the free-energy is obtained 
by comparing local fluctuations of the velocity in the presence 
and absence of the hindering. Clearly, beyond the TS the hin- 
dering is suppressed and the method becomes inaccurate. This 
does not represent a problem, since the unfolding free-energy 
barrier may in principle be calculated by the same algorithm 
using a different bias which drives the chain from the native 
to the denatured state. 

The time at which each value of the CC is visited, computed 



time dt(Q), i.e. the integrand of Eq. ( 14 1. We note that the 
residence time is longer in the region around Q ~ 0.5. The 
same feature is observed in the unbiased Langevin simulations 
(cfr. the lower panel in Pig j5]». The DRP estimate for the total 
TPT is Xjpj ~ 0.45 ±0.15 ns, again in good agreement with 
the results of the MD simulations, ijp T = (0.50 ±0.05) ns. 
We emphasize that the number of hMD trajectories needed 
to reconstruct G(Q) and compute Ijpt is of the same order 
of those which have been generated in our previous atomistic 
DRP simulations of protein folding. 



V. CONCLUSIONS 

In this work we have introduced an accelerated MD which 
allows to compute the free-energy profile G(Q) and the diffu- 
sion coefficient Dq which describe the stochastic dynamics of 
a previously determined slow collective variable Q. By apply- 
ing the DRP formalism we have shown that, once G(Q) has 
been calculated, it is straightforward to obtain an estimate of 
the TPT, which holds up to logarithmic corrections. 

The main advantages of the present method are (i) that the 
acceleration of the dynamics is not generated by any external 
force and (ii) that the systematic errors introduced in order to 
accelerate the overcoming of the free-energy barriers can be 
analytically computed, hence corrected for. 

A few remarks on the limitations of this method are in or- 
der. First, we emphasize that it relies on the possibility of 
identifying a single slow reaction coordinate for the macro- 
molecular system (a list of methods developed to this purpose 
can be found e.g. in Ref.s [31-36]). If a poorly chosen reac- 
tion coordinate is used, the free-energy profile is not expected 
to capture the correct rate-limiting barrier, and reaction rate 
calculation will be exponentially inaccurate. 

In general, it is not always possible to identify a single slow 
coarse variable (see e.g. in maze problem |37|). In these 
cases, our method is expected to miss substantial entropic con- 
tributions to the free-energy. Finally, we have found that the 
free-energy reconstruction is much less accurate in the region 
connecting the transition state to the product. 

All such limitations significant impact on the accuracy of 
free-energy profile calculations and reaction kinetics. On the 
other hand, the calculation of the TPT and of the time inter- 
vals between consecutive frames in DRP reaction paths are 
expected to be much more reliable, as the depend only loga- 
rithmically on the free-energy. 
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